Variational Autoencoders
A Variational Autoencoder (VAE) is a generative model that uses neural networks to encode input data into a latent space and then decode it back to reconstruct the original data. VAEs combine principles from deep learning and probabilistic graphical models, enabling unsupervised learning of complex data distributions.
Architecture
The VAE consists of three main components:
Encoder
- Transforms input data $x$ into a latent representation $z$.
- Outputs the parameters of the approximate posterior distribution $q_\phi(z|x)$, typically the mean $\mu$ and the log-variance $\log \sigma^2$.
- Implemented as a neural network parameterized by $\phi$ (see the sketch after this list).
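A minimal encoder sketch in PyTorch (the framework, layer sizes, and names such as `Encoder` and `latent_dim` are assumptions for this example, not part of the original text):

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps an input x to the parameters (mu, log_var) of q_phi(z|x)."""
    def __init__(self, input_dim=784, hidden_dim=400, latent_dim=20):
        super().__init__()
        self.hidden = nn.Linear(input_dim, hidden_dim)
        self.mu = nn.Linear(hidden_dim, latent_dim)       # mean of q_phi(z|x)
        self.log_var = nn.Linear(hidden_dim, latent_dim)  # log-variance of q_phi(z|x)

    def forward(self, x):
        h = torch.relu(self.hidden(x))
        return self.mu(h), self.log_var(h)
```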
Latent Space
- A lower-dimensional space representing the encoded features of the input data.
- Imposes a prior distribution $p(z)$, usually a standard normal distribution $\mathcal{N}(0, I)$.
- Enables sampling and generation of new data instances, as in the sketch after this list.
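Drawing latent samples from this prior is a single call (a minimal sketch; `latent_dim = 20` is an assumed value matching the encoder example above):

```python
import torch

latent_dim = 20  # assumed latent dimensionality for these examples
# A batch of 16 latent vectors z ~ N(0, I) drawn from the prior p(z)
z = torch.randn(16, latent_dim)
```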
Decoder
- Reconstructs the input data $\hat{x}$ from the latent representation $z$.
- Defines the likelihood $p_\theta(x|z)$ of the data given the latent variables.
- Implemented as a neural network parameterized by $\theta$ (see the sketch after this list).
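A matching decoder sketch under the same assumptions; the Bernoulli likelihood over pixel values (hence the sigmoid output) is an assumed modeling choice:

```python
import torch
import torch.nn as nn

class Decoder(nn.Module):
    """Maps a latent code z to the parameters of p_theta(x|z)."""
    def __init__(self, latent_dim=20, hidden_dim=400, output_dim=784):
        super().__init__()
        self.hidden = nn.Linear(latent_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, output_dim)

    def forward(self, z):
        h = torch.relu(self.hidden(z))
        # Sigmoid gives per-pixel Bernoulli means for p_theta(x|z)
        return torch.sigmoid(self.out(h))
```

Generating new data then amounts to decoding prior samples, e.g. `x_new = decoder(torch.randn(16, latent_dim))`.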
Mathematical Formulation
The VAE maximizes the Evidence Lower Bound (ELBO) on the marginal likelihood:

$$\mathcal{L}(\theta, \phi; x) = \mathbb{E}_{q_\phi(z|x)}\big[\log p_\theta(x|z)\big] - D_{KL}\big(q_\phi(z|x) \,\|\, p(z)\big)$$

Where:
- $q_\phi(z|x)$: Approximate posterior distribution.
- $p_\theta(x|z)$: Likelihood of the data given the latent variables.
- $D_{KL}(\cdot \,\|\, \cdot)$: Kullback-Leibler divergence between two distributions.
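For the common Gaussian choice $q_\phi(z|x) = \mathcal{N}(\mu, \sigma^2 I)$ with prior $p(z) = \mathcal{N}(0, I)$, the KL term has a closed form over the $d$ latent dimensions:

$$D_{KL}\big(q_\phi(z|x) \,\|\, p(z)\big) = -\frac{1}{2} \sum_{j=1}^{d} \left(1 + \log \sigma_j^2 - \mu_j^2 - \sigma_j^2\right)$$

This is the expression implementations typically compute directly from the encoder's $\mu$ and $\log \sigma^2$ outputs.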
Loss Function
The loss function combines two terms:
- Reconstruction Loss ($-\mathbb{E}_{q_\phi(z|x)}[\log p_\theta(x|z)]$): Measures how well the decoder reconstructs the input data.
- Regularization Term ($D_{KL}(q_\phi(z|x) \,\|\, p(z))$): Encourages the latent distribution to be close to the prior $p(z)$.
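A minimal sketch of this loss in PyTorch, assuming the Gaussian closed-form KL above and a Bernoulli reconstruction term (binary cross-entropy); the function name `vae_loss` is an assumption for this example:

```python
import torch
import torch.nn.functional as F

def vae_loss(x_recon, x, mu, log_var):
    # Reconstruction term: negative Bernoulli log-likelihood, summed over pixels
    recon = F.binary_cross_entropy(x_recon, x, reduction="sum")
    # KL term: closed form for N(mu, sigma^2 I) vs. the standard normal prior
    kl = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())
    return recon + kl
```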
Reparameterization Trick
To enable backpropagation through stochastic variables, the reparameterization trick is used: instead of sampling $z \sim q_\phi(z|x)$ directly, the sample is expressed as a deterministic function of the distribution parameters and an independent noise variable:

$$z = \mu + \sigma \odot \epsilon, \qquad \epsilon \sim \mathcal{N}(0, I)$$
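A minimal sketch of the trick, matching the $(\mu, \log \sigma^2)$ parameterization used above:

```python
import torch

def reparameterize(mu, log_var):
    # sigma = exp(0.5 * log(sigma^2))
    std = torch.exp(0.5 * log_var)
    # eps ~ N(0, I), sampled independently of the network parameters
    eps = torch.randn_like(std)
    # z is now a differentiable function of mu and log_var
    return mu + std * eps
```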